A Node Linkage Approach for Sequential Pattern Mining

Authors

  • Osvaldo Navarro
  • René Cumplido
  • Luis Villaseñor-Pineda
  • Claudia Feregrino-Uribe
  • Jesús Ariel Carrasco-Ochoa
Abstract

Sequential pattern mining is a widely studied problem in data mining, with applications such as analyzing Web usage, examining purchase behavior, and text mining, among others. Nevertheless, with the dramatic increase in data volume, current approaches prove inefficient when dealing with large input datasets, a large number of different symbols, and low minimum supports. In this paper, we propose a new sequential pattern mining algorithm, which follows a pattern-growth scheme to discover sequential patterns. Unlike most pattern-growth algorithms, our approach does not build a data structure to represent the input dataset, but instead accesses the required sequences through pseudo-projection databases, achieving better runtime and reducing memory requirements. Our algorithm traverses the search space in a depth-first fashion and keeps in memory only a pattern node linkage and the pseudo-projections required for the branch currently being explored. Experimental results show that our new approach, the Node Linkage Depth-First Traversal algorithm (NLDFT), has better performance and scalability than state-of-the-art algorithms.
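
The pseudo-projection idea mentioned in the abstract can be illustrated with a minimal sketch. This is a generic PrefixSpan-style pattern-growth miner, not the NLDFT algorithm itself: the pattern node linkage is not described in the abstract, so it is left out. The function name `mine`, the (sequence index, offset) representation of a projection, and the restriction to sequences of single items are all assumptions made for illustration.

```python
from collections import defaultdict


def mine(sequences, min_support):
    """Return {pattern: support} for every frequent sequential pattern.

    Hypothetical helper for illustration only. A pattern is a tuple of items,
    and its support is the number of input sequences that contain it as a
    (not necessarily contiguous) subsequence.
    """
    results = {}

    def grow(prefix, projection):
        # A pseudo-projection stores only (sequence index, start offset)
        # pairs, so the input sequences are never copied.
        occurrences = defaultdict(list)
        for sid, start in projection:
            seq = sequences[sid]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    # Keep only the first occurrence per sequence, so each
                    # sequence contributes at most once to the support.
                    seen.add(item)
                    occurrences[item].append((sid, pos + 1))
        for item, next_projection in occurrences.items():
            if len(next_projection) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(next_projection)
                # Depth-first: fully explore this extension before the next.
                grow(pattern, next_projection)

    grow((), [(sid, 0) for sid in range(len(sequences))])
    return results


if __name__ == "__main__":
    data = [list("abc"), list("acb"), list("ab")]
    for pattern, support in sorted(mine(data, 2).items()):
        print("".join(pattern), support)
```

Because the recursion is depth-first, the projections held at any moment are those created along the current branch of the search, which is the memory behavior the abstract alludes to.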

Similar resources

Detection of Linkage Patterns Repeating across Multiple Sequential Data

Sequential data mining is a technology for acquiring useful information and patterns from large quantities of sequential data. Research into industrial and commercial applications of sequential data mining is flourishing. The aim of this study is to propose a new method for detecting groups of patterns that appear in a linked manner across multiple sequential data and repeat along a time axis. ...

A New Web Usage Mining Approach for Next Page Access Prediction

To engage users of a website at an early stage of surfing, a novel web access recommendation system is essential. In this paper, a new web usage mining approach is proposed to predict next page access. It is proposed to identify similar access patterns from web log using pair-wise nearest neighbor based clustering and then sequential pattern mining is done on these patterns to determine next pa...

Position Coded Pre-order Linked WAP-Tree for Web Log Sequential Pattern Mining

Web access pattern tree algorithm mines web log access sequences by first storing the original web access sequence database on a prefix tree (WAP-tree). WAP-tree algorithm then mines frequent sequences from the WAP-tree by recursively re-constructing intermediate WAP-trees, starting with their suffix subsequences. This paper proposes an efficient approach for using the preorder linked WAP-trees...

Mining of Users’ Access Behaviour for Frequent Sequential Pattern from Web Logs

Sequential Pattern mining is the process of applying data mining techniques to a sequential database for the purposes of discovering the correlation relationships that exist among an ordered list of events. The task of discovering frequent sequences is challenging, because the algorithm needs to process a combinatorially explosive number of possible sequences. Discovering hidden information fro...

Journal title:

Volume 9, Issue

Pages -

Publication date: 2014